Abstract
The estimation of the ratio of two means is studied within the neutrosophic theory framework. The variable of
interest Y is measured in a sample of units, and the auxiliary variable X is obtainable for all units from records or
predictions. The two variables are correlated, and the sample is selected using simple random sampling. The
indeterminacy of the auxiliary variable is modeled by treating it as a neutrosophic variable. The bias and variance
of the proposed estimator are derived.
1. Introduction


Atanassov (1999) considered how soft set theory constitutes a general mathematical tool for modeling
uncertainty and imprecision. This approach overcomes the need for parameterization found in the theories of
probability, fuzzy sets and rough sets. Neutrosophic theory is based on a new conceptualization of sets; its roots
may be found in Smarandache (2002, 2003). The neutrosophic set is a generalization of the so-called intuitionistic
fuzzy set. The theory is being used in different areas of knowledge. Recent examples include Ajay (2020) and Crespo
Berti (2020), who modeled real-life problems, and Hatip (2020) and Saqlain et al. (2020), who developed extensions of it.


Neutrosophic theory may be used for characterizing the indeterminacy of a variable. In real-life problems the
parameters of a population are unknown, and statistical inferential procedures usually replace them by an adequate
estimate. The lack of knowledge of the value of a parameter is therefore overcome by determining approximate values
of it. The statistician knows that the obtained values are imprecise. The inaccuracy is measured by formulas that
provide a confidence level or an estimation error. See recent discussions of ratio estimation in Bouza and Alomari
(2019), where solutions are derived by using auxiliary information for ranking the pre-selected units, and Subzar
et al. (2019), who derived robust alternative ratio estimators.


Neutrosophic theory provides a new framework for dealing with imprecision. From a neutrosophic point of view,
statistical concepts and methods may be expanded; see, for example, Smarandache (2013, 2014), Schweizer (2020) and
Cacuango et al. (2020). This theory deals not with crisp values of the variables, but with set values.


Neutrosophic statistical data analysis has been discussed in different papers, and alternative tools have been
derived on the domain of neutrosophic sets. In general, neutrosophic statistics may be considered as an alternative
to classical statistics. Decision makers may be concerned with applications in which indeterminacy in the sample
must be dealt with. See, for example, Hanafy et al. (2012), who studied neutrosophic correlation in this context,
and Patro and Smarandache (2016) and Alhabib et al. (2018), who dealt with distributions. This paper is highly
motivated by the contributions of Aslam (2018, 2019b) and Aslam et al. (2020b), where sampling-based tools for
control charts with neutrosophic random variables were developed, as well as by the applications of neutrosophic
statistical tools reported in Aslam (2019a) and Aslam et al. (2020a).


A basic structure in neutrosophic theory is the set

$$A_N = \{\langle \xi, T_A(\xi), I_A(\xi), F_A(\xi)\rangle : \xi \in E\} \tag{1.1}$$
$$T_A(\xi) = \text{degree of membership} \tag{1.2}$$
$$I_A(\xi) = \text{degree of indeterminacy}, \qquad F_A(\xi) = \text{degree of non-membership} \tag{1.3}$$
$$\nu_A(\xi) \to [0,1],\ \nu = T, I, F; \qquad \nu \in\ ]^{-}0, 1^{+}[ \tag{1.4}$$

From a practical point of view the interval $[0,1]$ is used, due to the difficulties that arise from using $]^{-}0, 1^{+}[$.


For developing estimation methods in the context of this study neutrosophic numerical measures play a key role. 


The following results are particularly important. 


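Take $\gamma \in \mathbb{R}$, $Z \neq 0$, $Z \neq -W$; it holds that $\gamma Z + \gamma W I_Z = \gamma (Z + W I_Z)$. Then

1. $\forall \gamma \in \mathbb{R}$, $Z \neq 0$, $Z \neq -W$ holds that $\dfrac{I_Z}{Z + W I_Z} = \dfrac{I_Z}{Z + W}$;

2. $\forall Z \neq 0$, $Z \neq -W$ holds that $\dfrac{1}{Z + W I_Z} = (Z + W I_Z)^{-1}$;

3. $\forall Q \neq 0$ holds that $\dfrac{Q}{Z + W I_Z} = \dfrac{Q}{Z} - \dfrac{QW}{Z(Z+W)} I_Z$;
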





4. YZ +0,Z #—W holds that > 
See Smarandache (2013, 2014) for details.
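
The arithmetic behind these results can be checked numerically. The following is a minimal sketch, assuming the usual convention I*I = I for the indeterminacy; the class name `NeutrosophicNumber` and the helper `divide_scalar` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NeutrosophicNumber:
    """Number of the form det + ind*I, assuming the convention I*I = I."""
    det: float  # determinate part Z
    ind: float  # coefficient W of the indeterminacy I

    def __mul__(self, other):
        # (a + bI)(c + dI) = ac + (ad + bc + bd) I, because I*I = I
        return NeutrosophicNumber(
            self.det * other.det,
            self.det * other.ind + self.ind * other.det + self.ind * other.ind,
        )


def divide_scalar(q, n):
    """Return q / (Z + W I) = q/Z - q*W / (Z (Z + W)) I; needs Z != 0, Z != -W."""
    z, w = n.det, n.ind
    if z == 0 or z == -w:
        raise ZeroDivisionError("division requires Z != 0 and Z != -W")
    return NeutrosophicNumber(q / z, -q * w / (z * (z + w)))


n = NeutrosophicNumber(4.0, 1.0)    # 4 + I
quotient = divide_scalar(8.0, n)    # 8 / (4 + I)
check = quotient * n                # multiplying back should recover 8 + 0*I
```

Multiplying the quotient by the divisor recovers the crisp numerator, which is a quick way of confirming the conditions Z != 0 and Z != -W used above.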


Statistical procedures based on equations and formulas may be generalized by using the framework provided
by neutrosophic theory. When an estimate of a parameter $\theta$ is substituted by a set value, say $\theta_N$, the
estimate determines a neighborhood of $\hat\theta$ that includes the point estimate $\hat\theta$. The imprecision
associated with the point estimator $\hat\theta$ is included in the neutrosophic representation of the estimate.


An open-minded statistician will agree with this modeling. In real life, a sample is selected from the population,
but by the time the statistics are computed the situation has changed. That is, the true value of the variable is
different from the value measured and used in the computations. Consider, for example, a study of persons infected
with Covid-19. The researcher selects a sample for performing PCR tests, and the degree of infection of the selected
persons, say Z, is measured. When the laboratory finishes processing the collection of tests, the real value of Z in
the patients may be very different. Hence, the decision-making rule should consider that the value of Z is imprecise.
Determining a neighborhood of the computed estimate would be more realistic. Referring only to randomness for
modeling uncertainty is myopic in many cases. Developing particular neutrosophic statistical methods should allow
dealing with randomness and indeterminacy at the same time.
Let us provide a theoretical frame for sampling. Take a finite population of items

$$U = \{u_1, \dots, u_M\} \tag{1.5}$$

A random sample is selected from U using a sample design d. The probabilistic model is characterized by
$\{U, d\}$. The sample space S is an adequate algebra. The sampler fixes as sampling design a certain probability
measure

$$d(s) = \text{probability of observing a subset } s \subseteq U, \qquad \sum_{s \in S} d(s) = 1 \tag{1.6}$$

Taking Y as the variable of interest and $u_i$ as an item of U, the evaluation is represented by $Y(u_i) = Y_i$. A random
sample s of size m is selected by using the sample design d. If $u_i \in s$ then $Y_i$ is a random variable. In classical
statistics Y is evaluated in the selected units and an estimate (a statistic) is computed. The unknown parameter
$\theta$ is estimated using an estimator

$$\hat\theta = \theta(Y(s)), \qquad Y(s) = \{Y_1, \dots, Y_m \mid u_i \in s,\ i = 1, \dots, m\} \tag{1.7}$$

The behavior of a statistic is evaluated by considering how it behaves in the long run, that is, how it performs
over a long sequence of samples $s_1, \dots, s_P$, $P \to \infty$.
An extension of classical statistics may be developed within neutrosophic statistics using similar principles.
When the statistician considers that the data are known only approximately, he or she would not be confident
in inferences generated using crisp numbers. For example, if a question is sensitive, respondents carrying the
stigma will tend to falsify the true value of Y. The data are then expected to be indeterminate. This situation is
present in different neutrosophic studies, such as the problems analyzed in Cacuango et al. (2020).


The neutrosophic statistics framework admits that the information provided by the data is not crisp, but ambiguous,
vague, imprecise and/or incomplete. When the indeterminacy is zero, the analysis made from a neutrosophic point of
view coincides with the results derived by classical statistics. The practitioner using neutrosophic statistical
methods would interpret and organize the data taking into account the existence of these indeterminacies, in order
to obtain some clues on the underlying patterns.


Some basic ideas on estimation within a neutrosophic framework are developed in the next section.
Section 3 is concerned with some aspects of estimation theory in a neutrosophic context.
Section 4 develops a theory on ratio estimation. Numerical experiments are discussed in Section 5.


2. Estimation in a Neutrosophic context 


The proposals of Smarandache (2014, 2016) discussed how common statistical equations and formulas, due to the
indeterminacy of the data for some of the involved variables, may be better replaced by considering that they take
values in a fixed set. The usual notation replaces the crisp variable Z by its neutrosophic counterpart $Z_N$, where
N identifies that the variable is "neutrosophic". The imprecision in the true value of Z is modeled by considering
not a value but a set including it. In applications of statistics, decision makers frequently deal with imprecise
data. A convenient model seems to be considering $Z_{iN} = Z_i + A_i I_Z$ instead of $Z_i$. The statistician measures
$Z_i$ but has the feeling that it is imprecise. It is subject to a basic error, which belongs to an interval $I_Z$ and
is "tuned" by $A_i$ for each i. For example, a promoter obtains on the web a value of the index of achievement of a
singer, say $Z_i$, but doubts that it is correct, as public preferences change constantly. Suppose the change in the
singers' indexes moves in $I_Z = (-2.5, 2.5)$, but for a particular singer the decision maker considers that it may
be 3.5 times larger. Then $Z_i + 3.5 I_Z$ is recorded. The decision maker is now able to implement decision rules in
which imprecision is modeled.
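
The singer example can be made concrete with a short sketch. It assumes that the recorded value Z_i + A_i I_Z is read as the interval swept when I_Z ranges over its bounds; the function name `neutrosophic_interval` and the index value 60 are hypothetical and serve only as an illustration.

```python
def neutrosophic_interval(z, a, i_low, i_high):
    """Return the interval covered by z + a*I when I ranges over (i_low, i_high)."""
    ends = (z + a * i_low, z + a * i_high)
    return (min(ends), max(ends))


# Hypothetical index value Z_i = 60, with I_Z = (-2.5, 2.5) and A_i = 3.5
low, high = neutrosophic_interval(60.0, 3.5, -2.5, 2.5)
```

Under these assumptions the decision rule would work with every value between `low` and `high` instead of the single crisp value 60.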


Consider a sample of size m; the observations determine the sequence $\{(Z_i + A_i I_Z),\ i = 1, \dots, m\}$. From
Smarandache (2014, 2016) it is easily deduced that the sample mean is

$$\bar Z_{N(m)} = \bar Z_m + \bar A_m I_Z = \frac{1}{m}\left(\sum_{i=1}^{m} Z_i + \sum_{i=1}^{m} A_i I_Z\right) \tag{2.1}$$

The deviation of each observation is

$$D_{N(m)i} = (Z_i + A_i I_Z) - (\bar Z_m + \bar A_m I_Z) \tag{2.2}$$

and its square is given by

$$D^2_{N(m)i} = \left[(Z_i + A_i I_Z) - (\bar Z_m + \bar A_m I_Z)\right]^2 \tag{2.3}$$

Therefore, the sample variance is

$$S^2_{Z_N(m)} = \frac{1}{m}\sum_{i=1}^{m} (Z_i - \bar Z_m)^2 + \frac{1}{m}\sum_{i=1}^{m} \left[2(Z_i - \bar Z_m)(A_i - \bar A_m) + (A_i - \bar A_m)^2\right] I_Z \tag{2.4}$$

because $(Z_i - \bar Z_m + A_i I_Z - \bar A_m I_Z)^2 = (Z_i - \bar Z_m)^2 + \left[2(Z_i - \bar Z_m)(A_i - \bar A_m) + (A_i - \bar A_m)^2\right] I_Z$.
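
The sample mean and variance of neutrosophic observations can be computed part by part. The sketch below assumes that each observation is stored as a pair (Z_i, A_i), that I*I = I, and that the divisor is m as in the formulas above; the function name `neutrosophic_mean_var` is hypothetical.

```python
def neutrosophic_mean_var(z, a):
    """Return ((mean_det, mean_ind), (var_det, var_ind)) for data Z_i + A_i * I."""
    m = len(z)
    z_bar = sum(z) / m
    a_bar = sum(a) / m
    # Determinate part of the variance: usual mean squared deviation of Z
    var_det = sum((zi - z_bar) ** 2 for zi in z) / m
    # Indeterminate part: coefficient of I after squaring the deviation
    var_ind = sum(
        2 * (zi - z_bar) * (ai - a_bar) + (ai - a_bar) ** 2
        for zi, ai in zip(z, a)
    ) / m
    return (z_bar, a_bar), (var_det, var_ind)


mean, var = neutrosophic_mean_var([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

When every A_i takes the same value, as in this example, the indeterminate part of the variance vanishes and only the classical variance of Z remains.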


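(2.1) and (2.4) estimate the neutrosophic parameters

$$\bar Z_N = \bar Z_M + \bar A_M I_Z = \frac{1}{M}\left(\sum_{i=1}^{M} Z_i + \sum_{i=1}^{M} A_i I_Z\right) \tag{2.5}$$

$$\sigma^2_{Z_N} = \frac{1}{M}\sum_{i=1}^{M} (Z_i - \bar Z_M)^2 + \frac{1}{M}\sum_{i=1}^{M} \left[2(Z_i - \bar Z_M)(A_i - \bar A_M) + (A_i - \bar A_M)^2\right] I_Z \tag{2.6}$$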


The decision maker is interested in estimating a ratio of two variables. One of them is measured and the other is
obtained from records. For example, we may conduct a survey and obtain an achievement index for the singers of a
record company. The records obtained on the web provide information on the downloads of a song, but they are
considered imprecise. The decision maker then has a crisp variable, obtained in the survey, and a neutrosophic one,
obtained from the web.


In the sequel the sample design considered is Simple Random Sampling With Replacement (SRSWR). The paper is
concerned with the estimation of the ratio of a crisp variable and a neutrosophic one. The approximate bias and
variance of the proposal are derived in Section 4.
3. Some considerations on sampling 


The use of sampling experiments in different fields of applied science is one of the most important
achievements of statistical inference. The model considers the existence of a finite population
$U = \{u_1, \dots, u_M\}$ whose units are well identifiable. The researcher is interested in estimating a function
of a variable Y, which is well defined for each individual of U. It is useful to assume that the experimenter knows
an auxiliary variable X for every individual of U, $\{X(u_i) = X_i,\ i = 1, \dots, M\}$, while Y is unknown. The
sampler may use the known values of X in the inference process. Under some mild conditions we may develop models
that yield more accurate results by including the information provided by X. It is common to have to deal with
indeterminacies in the values of X, and it is necessary to determine how the model is affected. Hence, as the
recorded values of X are imprecise, an alternative is to use the neutrosophic number $X + A I_X$.


DOI: 10.5281/zenodo.4030309


International Journal of Neutrosophic Science (IJNS) Vol. 11, No. 1, PP. 9-21, 2020





The sample experiments are described as usual. For the population U there is a sample space S. The sampler selects
a sample design d(s). It assigns a probability to each element of the sample space and allows determining the
probability of selecting a certain unit of U by computing

$$P(u_i) = \sum_{\{s \in S \,\mid\, u_i \in s\}} d(s) \tag{3.1}$$

The variable of interest Y is measured in each selected unit; it is a random variable $y_i$ that provides a result
$Y(u_j)$. The sampler looks in the collection of recorded values of X, $X_1, \dots, X_M$, and obtains the
corresponding random value $x_i$. Due to the existence of some indeterminacy, it is considered to be the
neutrosophic random variable

$$x_i + I_{x_i}, \qquad I_{x_i} = A_i I_X \tag{3.2}$$

The study should therefore consider the existence of indeterminacy in the records and, acknowledging this fact,
work within a neutrosophic framework. Hence, taking an individual $u_i$ of the population,
$u_i \mapsto (Y_i, X_i + I_{X_i})$, $I_{X_i} = A_i I_X$, is obtainable from it. As the unit is selected with
probability $P(u_i)$, the gathered information is random. Using the sample design d, the expectation of the result
of the random experiment may be determined. Performing simple operations with neutrosophic numbers, for an
observation it is

$$E_d(y_i, x_i + I_{x_i}) = \sum_{j=1}^{M} (Y_j, X_j + A_j I_X) P(u_j) = \left(\sum_{j=1}^{M} Y_j P(u_j),\ \sum_{j=1}^{M} X_j P(u_j) + \Big(\sum_{j=1}^{M} A_j P(u_j)\Big) I_X\right) = (\mu_{dY},\ \mu_{dX} + \mu_{dA} I_X). \tag{3.2}$$
In the rest of the paper, without loss of generality, it is considered that $A_j = A$ for any $j = 1, \dots, M$.


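The sampling design to be considered in this paper is Simple Random Sampling (SRS) without replacement. It is
defined, see Singh (2003), as

$$d(s) = \begin{cases} \dfrac{1}{\binom{M}{m}} & \text{if } |s| = m \\[2mm] 0 & \text{otherwise} \end{cases} \tag{3.3}$$

In that case

$$\forall i = 1, \dots, M, \qquad P(u_i) = \frac{m}{M} \tag{3.4}$$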
This result also holds if the selection is made with replacement. When M is sufficiently large the difference between
selecting with or without replacement is negligible, in terms of the inferential processes, when

$$\frac{M-m}{M-1} \cong 1 \tag{3.5}$$
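
The SRS design and the inclusion probability m/M can be verified by enumeration on a small population. This is a minimal sketch for sampling without replacement; the function names `srs_design` and `inclusion_probability` and the sizes M = 5, m = 2 are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def srs_design(M, m):
    """Map every subset s of size m of {0, ..., M-1} to d(s) = 1 / C(M, m)."""
    p = Fraction(1, comb(M, m))
    return {s: p for s in combinations(range(M), m)}


def inclusion_probability(design, unit):
    """P(u_i): sum of d(s) over all samples s that contain the unit."""
    return sum(p for s, p in design.items() if unit in s)


design = srs_design(5, 2)
total = sum(design.values())             # the design is a probability measure
pi_0 = inclusion_probability(design, 0)  # probability that unit 0 is selected
```

Exact fractions are used so that the comparison with m/M does not depend on rounding.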
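An SRS sample is selected and the realizations $\{(y_i, x_i + A_i I_X),\ i = 1, \dots, m\}$ are observed. Under SRS
$P(u_i)$ is constant and, using (3.2), the expectation is

$$E_d(y_i, x_i + I_{x_i}) = (\bar Y,\ \bar X + \bar A I_X), \qquad \bar Z = \frac{1}{M}\sum_{j=1}^{M} Z_j,\ Z = Y, X \tag{3.6}$$

because, for this design,

$$\mu_{dZ} = \bar Z = \frac{1}{M}\sum_{j=1}^{M} Z_j, \qquad Z = X, Y, A \tag{3.7}$$

Using (2.6) it is easily derived that the variance of $x_i + I_{x_i}$ is given by

$$V_d(x_i + I_{x_i}) = \sigma^2_{X_N} = \frac{1}{M}\sum_{i=1}^{M} (X_i - \bar X_M)^2 + \frac{1}{M}\sum_{i=1}^{M} \left[2(X_i - \bar X_M)(A_i - \bar A_M) + (A_i - \bar A_M)^2\right] I_X = \sigma_X^2 + \sigma_{XA} I_X \tag{3.8}$$

Y is a crisp variable and its variance is

$$V_d(y_i) = \sigma_Y^2 = \frac{1}{M}\sum_{i=1}^{M} (Y_i - \bar Y_M)^2 \tag{3.9}$$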


4. A ratio estimator 


Statistical research must frequently deal with the estimation of a ratio, or use it for deriving an estimator of
the mean or the total of a variable of interest Y. A concomitant, or auxiliary, variable X, correlated with Y, is
known. The population ratio is $R = \bar Y / \bar X$. Consider that an SRS sample is drawn from the population. A
naive estimator is $\hat R = \bar y / \bar x$, where $\bar y$ and $\bar x$ are the sample means of Y and X. The
auxiliary variable is commonly obtained from records or predictions, which are usually outdated and/or subject to
imprecision. The imprecision may be modeled adequately in the context of neutrosophic theory.
Consider the neutrosophic number $X + A I_X$, $I_X \in (a, b)$. The sampler may model the imprecise knowledge of X by
determining, for every individual of the population, a measurement error interval. For example, if the decision
maker considers that the percentage of tax under-reporting is between 0% and 20%, this fixes $I_X \in (0, 0.2X)$.


The variable Y is measured by directly interviewing the individuals of the population. It is a non-neutrosophic
number. In the previous examples, R may be
• The ratio of the mean of the public's preferences for a song to the monthly mean of downloads.
• The ratio of the reported taxes to the payments on the previous occasion.
• The ratio of the fuel consumption reports of a transport enterprise in two consecutive months.
• The ratio of the monthly incomes of employees to the use of their credit cards.
Classical sampling theory assumes that there is no imprecision in X. The sampler, being uncertain about the
precision of the values of X, fixes $(x_0, x_1)$ and works with the neutrosophic value $X_{iN} = X_i + A_i I_{X_i}$,
$I_{X_i} \in (x_0, x_1)$, $i = 1, \dots, M$. Take the crisp ratio of the means of two variables Y and X

$$R = \frac{\frac{1}{M}\sum_{u_i \in U} Y_i}{\frac{1}{M}\sum_{u_i \in U} X_i} = \frac{\bar Y}{\bar X} \tag{4.1}$$


$\bar X$ is known, but it is expected that it may change when the data are processed for computing. As the values of
X come from imprecise recorded data, the decision maker fixes a conservative rule using $Q = \bar Y$, $Z = \bar X$,
$W = \bar X$. In the neutrosophic context R is denoted as

$$R_N = \frac{Q}{Z + W I_Z} = \frac{Q}{Z} - \frac{QW}{Z(Z+W)} I_Z, \qquad Q = \bar Y,\ Z = \bar X,\ I_Z \in [0,1]. \tag{4.2}$$


The alternative neutrosophic population ratio is

$$R_N = \frac{\bar Y}{\bar X} + \frac{\bar Y}{\bar X^2 + \bar X} I_{\bar X} = R + R^* I_{\bar X} \tag{4.3}$$

The operator $\frac{\bar Y}{\bar X^2 + \bar X} = R^*$ is used for modeling the indeterminacy arising from the lack of
knowledge of the true value of $\bar X$. Values of $I_{\bar X}$ close to zero model the decision maker's confidence
that $\bar X$ is the true expectation of X. On many occasions it is adequate to use

$$I_{\bar X} = \frac{1}{M}\sum_{i=1}^{M} I_{X_i} = I_X, \qquad I_X \in (x_0, x_1). \tag{4.4}$$

The decision maker usually fixes $x_0 = 0$.


Once a sample is selected, the sample mean of $X_N$ is computed

$$\bar x_N = \bar x + \bar A_m I_X, \qquad \bar x = \frac{1}{m}\sum_{i=1}^{m} x_i \tag{4.5}$$

as well as

$$\bar I_x = \frac{1}{m}\sum_{i=1}^{m} I_{x_i} \tag{4.6}$$

The need to estimate a ratio of two variables in this context suggests using $\bar A_m = \bar x$. Then the
neutrosophic estimator is

$$\hat R_N = \frac{\bar y}{\bar x_N} = \frac{\bar y}{\bar x} + \frac{\bar y}{\bar x^2 + \bar x} I_X = r + r^* I_X \tag{4.7}$$
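
The two components of the estimator can be computed directly from a sample. The sketch below assumes the reading r = ybar / xbar and r* = ybar / (xbar^2 + xbar) of the estimator, and an indeterminacy interval (0, 0.2) chosen only for illustration; the function names and the data are hypothetical.

```python
def neutrosophic_ratio(y, x):
    """Return (r, r_star) with r = ybar/xbar and r_star = ybar/(xbar**2 + xbar)."""
    y_bar = sum(y) / len(y)
    x_bar = sum(x) / len(x)
    return y_bar / x_bar, y_bar / (x_bar ** 2 + x_bar)


def ratio_interval(r, r_star, i_low, i_high):
    """Interval swept by r + r_star * I when I ranges over (i_low, i_high)."""
    ends = (r + r_star * i_low, r + r_star * i_high)
    return (min(ends), max(ends))


# Hypothetical sample: Y is exactly twice X
r, r_star = neutrosophic_ratio([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
bounds = ratio_interval(r, r_star, 0.0, 0.2)
```

With a zero lower bound for the indeterminacy, the left end of the interval coincides with the classical ratio estimate r.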


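Take the first term and analyze

$$r - R = \frac{\bar y}{\bar x} - \frac{\bar Y}{\bar X} \tag{4.8}$$

Consider $\Delta_z = \frac{\bar z - \bar Z}{\bar Z}$, $z = x, y$. As $|\Delta_z| < 1$, with
$\lim_{m \to \infty} \Delta_z = 0$, it is valid to develop r in a Taylor series in $\Delta_z$:

$$r = R(1 + \Delta_y)(1 + \Delta_x)^{-1} = R\left[1 + \Delta_y - \Delta_x + \Delta_x^2 - \Delta_y\Delta_x + O(\Delta_x^3)\right]$$

The expected values of the summands are $E_d(\Delta_z) = 0$, $z = x, y$, $E_d(\Delta_x^2) = \frac{\sigma_X^2}{m \bar X^2}$
and $E_d(\Delta_y \Delta_x) = \frac{\sigma_{XY}}{m \bar X \bar Y}$, and it is derived that

$$E_d(r - R) \cong R\left[\frac{\sigma_X^2}{m \bar X^2} - \frac{\sigma_{XY}}{m \bar X \bar Y}\right] \tag{4.9}$$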
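
The first-order bias of the classical ratio r can be compared with its exact value on a small population. The sketch below assumes sampling with replacement, population variances with divisor M, and the approximation R * (var_x / (m * Xbar^2) - cov_xy / (m * Xbar * Ybar)); the function names and the data are hypothetical.

```python
from itertools import product


def approx_ratio_bias(y, x, m):
    """First-order bias R * (var_x/(m*Xbar^2) - cov_xy/(m*Xbar*Ybar))."""
    M = len(x)
    x_bar, y_bar = sum(x) / M, sum(y) / M
    var_x = sum((xi - x_bar) ** 2 for xi in x) / M
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / M
    R = y_bar / x_bar
    return R * (var_x / (m * x_bar ** 2) - cov_xy / (m * x_bar * y_bar))


def exact_ratio_bias(y, x, m):
    """Exact bias of r, enumerating all M**m ordered samples with replacement."""
    M = len(x)
    R = sum(y) / sum(x)
    total = 0.0
    for idx in product(range(M), repeat=m):
        total += sum(y[i] for i in idx) / sum(x[i] for i in idx)
    return total / M ** m - R


x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]   # Y proportional to X, so r equals R in every sample
approx = approx_ratio_bias(y, x, 2)
exact = exact_ratio_bias(y, x, 2)
```

When Y is proportional to X both quantities vanish, which gives a simple sanity check of the formula; for other populations the two values generally differ by terms of higher order in 1/m.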


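For the second term

$$(r^* - R^*) I_X = \left(\Delta_y + \Delta_{\tilde x}^2 - \Delta_y \Delta_{\tilde x} + O(\Delta_{\tilde x}^3)\right) I_X \tag{4.10}$$

where

$$\Delta_{\tilde x} = \frac{(\bar x^2 + \bar x) - (\bar X^2 + \bar X)}{\bar X^2 + \bar X} \tag{4.11}$$

The approximation is valid assuming that $\Delta_{\tilde x} \cong 0$ for a sufficiently large sample size. The first
term of the expected value is $E_d(\Delta_y) = 0$. Note that

$$\Delta_{\tilde x}^2 = \left(\frac{(\bar x^2 - \bar X^2) + (\bar x - \bar X)}{\bar X^2 + \bar X}\right)^2 \tag{4.12}$$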
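Its design expectation is

$$E_d(\Delta_{\tilde x}^2) = \frac{1}{(\bar X^2 + \bar X)^2}\left[(\mu_{\bar x^4} - \bar X^4) + (\mu_{\bar x^3} - \bar X^3) + \frac{\sigma_X^2}{m}\left(1 - 2(\bar X^2 + \bar X)\right)\right] \tag{4.13}$$

On the other hand

$$\Delta_y \Delta_{\tilde x} = \left(\frac{\bar y - \bar Y}{\bar Y}\right)\left(\frac{(\bar x^2 + \bar x) - (\bar X^2 + \bar X)}{\bar X^2 + \bar X}\right) \tag{4.14}$$

and

$$E_d(\Delta_y \Delta_{\tilde x}) = \left(\frac{1}{\bar Y}\right)\left(\frac{1}{\bar X^2 + \bar X}\right)\left(\mu_{\bar y \bar x^2} + \sigma_{XY} - \bar Y \sigma_X^2\right) \tag{4.15}$$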


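Hence we may state that

$$\mathrm{Bias}(\hat R_N) \cong R\left[\frac{\sigma_X^2}{m \bar X^2} - \frac{\sigma_{XY}}{m \bar X \bar Y}\right] + \left\{\frac{1}{(\bar X^2 + \bar X)^2}\left[(\mu_{\bar x^4} - \bar X^4) + (\mu_{\bar x^3} - \bar X^3) + \frac{\sigma_X^2}{m}\left(1 - 2(\bar X^2 + \bar X)\right)\right] - \left(\frac{1}{\bar Y}\right)\left(\frac{1}{\bar X^2 + \bar X}\right)\left(\mu_{\bar y \bar x^2} + \sigma_{XY} - \bar Y \sigma_X^2\right)\right\} I_X \tag{4.16}$$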


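Then, we have proved the following result.


Proposition 1. The expectation of the estimator $\hat R_N = \frac{\bar y}{\bar x_N} = \frac{\bar y}{\bar x} + \frac{\bar y}{\bar x^2 + \bar x} I_X = r + r^* I_X$
of the neutrosophic ratio $R_N = \frac{\bar Y}{\bar X} + \frac{\bar Y}{\bar X^2 + \bar X} I_X = R + R^* I_X$, when the
sample is selected using simple random sampling, is approximately

$$E_d(\hat R_N) \cong R_N + R\left[\frac{\sigma_X^2}{m \bar X^2} - \frac{\sigma_{XY}}{m \bar X \bar Y}\right] + \left\{\frac{1}{(\bar X^2 + \bar X)^2}\left[(\mu_{\bar x^4} - \bar X^4) + (\mu_{\bar x^3} - \bar X^3) + \frac{\sigma_X^2}{m}\left(1 - 2(\bar X^2 + \bar X)\right)\right] - \left(\frac{1}{\bar Y}\right)\left(\frac{1}{\bar X^2 + \bar X}\right)\left(\mu_{\bar y \bar x^2} + \sigma_{XY} - \bar Y \sigma_X^2\right)\right\} I_X, \qquad \mu_{\bar x^t} = E_d(\bar x^t), \tag{4.17}$$

when the sample size m is sufficiently large for accepting that both $\Delta_{\tilde x}$ and $\Delta_x$ are
approximately equal to zero.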


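For deriving the Mean Squared Error (MSE) consider

$$(\hat R_N - R_N)^2 = \left(\frac{\bar y}{\bar x} + \frac{\bar y}{\bar x^2 + \bar x} I_X - \frac{\bar Y}{\bar X} - \frac{\bar Y}{\bar X^2 + \bar X} I_X\right)^2 = (r + r^* I_X - R - R^* I_X)^2 = (r - R)^2 + (r^* I_X - R^* I_X)^2 + 2\big((r - R)(r^* I_X - R^* I_X)\big) \tag{4.18}$$

Developing the first term in a Taylor series, we obtain

$$(r - R)^2 \cong \Delta_y^2 + \Delta_x^2 - 2\Delta_y \Delta_x \tag{4.19}$$

Hence

$$E_d(r - R)^2 \cong \frac{\sigma_Y^2}{m \bar Y^2} + \frac{\sigma_X^2}{m \bar X^2} - 2\frac{\sigma_{XY}}{m \bar X \bar Y} \tag{4.20}$$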
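
The crisp part of the mean squared error can be evaluated from population data. The sketch below computes the expression var_y/(m*Ybar^2) + var_x/(m*Xbar^2) - 2*cov_xy/(m*Xbar*Ybar) exactly as written above, with population moments using divisor M; the function name `classical_mse_term` and the data are hypothetical.

```python
def classical_mse_term(y, x, m):
    """var_y/(m*Ybar^2) + var_x/(m*Xbar^2) - 2*cov_xy/(m*Xbar*Ybar)."""
    M = len(x)
    x_bar, y_bar = sum(x) / M, sum(y) / M
    var_x = sum((xi - x_bar) ** 2 for xi in x) / M
    var_y = sum((yi - y_bar) ** 2 for yi in y) / M
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / M
    return (
        var_y / (m * y_bar ** 2)
        + var_x / (m * x_bar ** 2)
        - 2 * cov_xy / (m * x_bar * y_bar)
    )


proportional = classical_mse_term([2.0, 4.0, 6.0], [1.0, 2.0, 3.0], 2)
constant_y = classical_mse_term([1.0, 1.0, 1.0], [1.0, 2.0, 3.0], 2)
```

The term vanishes when Y is proportional to X, and it reduces to var_x/(m*Xbar^2) when Y is constant.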


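The second term is

$$(r^* - R^*)^2 I_X = \left(\Delta_y^2 + \Delta_{\tilde x}^2 - 2\Delta_y \Delta_{\tilde x} + O(\Delta_{\tilde x}^3)\right) I_X \tag{4.21}$$

and its expectation is given by

$$E_d(r^* - R^*)^2 I_X \cong \left[\frac{\sigma_Y^2}{m \bar Y^2} + \frac{1}{(\bar X^2 + \bar X)^2}\left[(\mu_{\bar x^4} - \bar X^4) + (\mu_{\bar x^3} - \bar X^3) + \frac{\sigma_X^2}{m}\left(1 - 2(\bar X^2 + \bar X)\right)\right] - 2\left(\frac{1}{\bar Y}\right)\left(\frac{1}{\bar X^2 + \bar X}\right)\left(\mu_{\bar y \bar x^2} + \sigma_{XY} - \bar Y \sigma_X^2\right)\right] I_X \tag{4.22}$$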
Developing the third term, we obtain

$$(r - R)(r^* I_X - R^* I_X) = (\Delta_y - \Delta_x + \Delta_x^2 - \Delta_y \Delta_x)(\Delta_y + \Delta_{\tilde x}^2 - \Delta_y \Delta_{\tilde x}) I_X \cong (\Delta_y^2 - \Delta_y \Delta_x) I_X \tag{4.23}$$

because only the terms of order $t \le 2$ in the Taylor series are considered as significant. Therefore

$$E_d\big((r - R)(r^* I_X - R^* I_X)\big) \cong \left(\frac{\sigma_Y^2}{m \bar Y^2} - \frac{\sigma_{XY}}{m \bar Y \bar X}\right) I_X \tag{4.24}$$
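These results are used for proving the following proposition.


Proposition 2. Under the set of hypotheses of Proposition 1, the approximate MSE of

$$\hat R_N = \frac{\bar y}{\bar x} + \frac{\bar y}{\bar x^2 + \bar x} I_X = r + r^* I_X \tag{4.25}$$

is

$$\mathrm{MSE}(\hat R_N) \cong \left[\frac{\sigma_Y^2}{m \bar Y^2} + \frac{\sigma_X^2}{m \bar X^2} - 2\frac{\sigma_{YX}}{m \bar Y \bar X}\right] + \left[\frac{\sigma_Y^2}{m \bar Y^2} + \frac{1}{(\bar X^2 + \bar X)^2}\left[(\mu_{\bar x^4} - \bar X^4) + (\mu_{\bar x^3} - \bar X^3) + \frac{\sigma_X^2}{m}\left(1 - 2(\bar X^2 + \bar X)\right)\right] - 2\left(\frac{1}{\bar Y}\right)\left(\frac{1}{\bar X^2 + \bar X}\right)\left(\mu_{\bar y \bar x^2} + \sigma_{XY} - \bar Y \sigma_X^2\right) + 2\left(\frac{\sigma_Y^2}{m \bar Y^2} - \frac{\sigma_{YX}}{m \bar Y \bar X}\right)\right] I_X = M_C + M_N \tag{4.26}$$

Note that, if $I_X = 0$, the classical sampling results on ratio estimators are obtained.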
5. Numerical studies 


Only one of the multiple theoretical challenges of developing neutrosophic counterparts of sampling models is
considered in this paper. The estimation of a ratio when the auxiliary variable is neutrosophic poses a set of
theoretical problems that remain to be solved for other classes of ratio estimators.


With the aim of illustrating the behavior of the proposal, data obtained from four real-life problems are analyzed
numerically. They are:


P1. 230 persons aged 15-35 were questioned on their preferences for 5 songs. The reports (Y) were measured on a
scale of 1-10. The mean of the daily downloads of the songs in the last 30 days was the auxiliary variable X. The
manager of a record company was the decision maker.

P2. The farmers' tax report was the variable of interest Y, and X was the last month's tax payment. The population
size was 450. The consensus of the specialists of the state office performed the role of the decision maker.

P3. The fuel consumption reports of a fleet of 76 vehicles in two consecutive months were measured. Y was the
consumption in the current month and X the report in the previous month. The owner of the enterprise acted as
decision maker.

P4. The monthly income of 117 employees of an enterprise was Y, and X was the total amount of operations with
their credit cards. The owner of the enterprise acted as decision maker.


The researchers had complete knowledge of Y and X. Hence it was possible to compute the values of the involved
parameters. $\hat R_N$ was obtained from the sample results and compared with the true value of $R_N$. B random
samples of size m were selected from each population, and the accuracy of the estimates was measured by computing

$$a_{Pj} = \frac{1}{B}\sum_{b=1}^{B} |r - R|_b + \frac{1}{B}\sum_{b=1}^{B} |r^* - R^*|_b\, I_X = a_{Cj} + a_{Nj}, \qquad j = 1, \dots, 4 \tag{5.1}$$
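
The accuracy measure can be sketched as a small Monte Carlo routine. This is a minimal illustration, assuming SRS without replacement, r* = ybar / (xbar^2 + xbar), and a synthetic population that is not one of the four studied; the function name `monte_carlo_accuracy`, the seed and the data are hypothetical. The second returned value is the coefficient that multiplies I_X.

```python
import random


def monte_carlo_accuracy(y, x, m, B, seed=0):
    """Average |r - R| and |r* - R*| over B simple random samples of size m."""
    rng = random.Random(seed)
    M = len(x)
    x_bar, y_bar = sum(x) / M, sum(y) / M
    R = y_bar / x_bar
    R_star = y_bar / (x_bar ** 2 + x_bar)
    a_c = a_n = 0.0
    for _ in range(B):
        idx = rng.sample(range(M), m)          # SRS without replacement
        ys = sum(y[i] for i in idx) / m
        xs = sum(x[i] for i in idx) / m
        a_c += abs(ys / xs - R)                # crisp component
        a_n += abs(ys / (xs ** 2 + xs) - R_star)  # coefficient of I_X
    return a_c / B, a_n / B


# Synthetic population with Y = 2X, so the crisp error is zero in every sample
x = [float(v) for v in range(1, 21)]
y = [2.0 * v for v in x]
a_c, a_n = monte_carlo_accuracy(y, x, m=5, B=200)
```

Multiplying the second value by the end points of the interval fixed for I_X would give a range for the neutrosophic component, in the spirit of the ranges reported for each population.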

Using the population information, (4.26) was computed for each population

$$\mathrm{MSE}(\hat R_N) = M_{Cj} + M_{Nj} = \mathrm{MSE}_j, \qquad j = 1, \dots, 4 \tag{5.2}$$

See the results of the study in Table 1.


Table 1. Results of the Monte Carlo experiments

Population | m  | B    | a_Cj  | a_Nj        | a_Pj        | M_Cj  | M_Nj         | MSE_j
1          | 25 | 1000 | 0.324 | 0.072-0.250 | 0.396-0.774 | 1.561 | 6.331-7.117  | 7.892-8.678
2          | 50 | 1500 | 1.920 | 0.400-0.652 | 2.320-2.552 | 5.452 | 0.851-1.807  | 6.303-7.259
3          | 15 | 2600 | 2.690 | 0.841-1.741 | 3.531-4.431 | 8.785 | 6.263-11.549 | 15.048-20.334
4          | 25 | 2000 | 1.130 | 0.757-0.965 | 1.887-2.095 | 1.873 | 6.022-7.625  | 7.895-8.498





A reading of the rows of the previous table suggests that the samples averaged an absolute difference between the
estimate and the true value that takes values in the corresponding a_Pj column. The MSE of the method appears
in the MSE_j column. The decision makers fixed the corresponding $I_X$. They considered that a good description of
the imprecision of the estimates was obtained, according to their own judgment.